`ort` is an (unofficial) ONNX Runtime 1.16 wrapper for Rust based on the now inactive `onnxruntime-rs`. ONNX Runtime accelerates ML inference on both CPU & GPU.
See the docs for more detailed information and the examples. If you have any questions, feel free to ask in the #💬|ort-discussions and related channels in the pyke Discord server or in GitHub Discussions.
- Feature comparison
- Cargo features
- How to get binaries
- Execution providers
- Projects using `ort` ❤️
- FAQ
- Shared library hell
Feature comparison
Feature comparison | 📕 ort | 📗 ors | 🪟 onnxruntime-rs |
---|---|---|---|
Upstream version | v1.16.0 | v1.12.0 | v1.8 |
dlopen() ? | ✅ | ✅ | ❌ |
Execution providers? | ✅ | ❌ | ❌ |
IOBinding? | ✅ | ❌ | ❌ |
String tensors? | ✅ | ❌ | ⚠️ input only |
Multiple output types? | ✅ | ✅ | ❌ |
Multiple input types? | ✅ | ✅ | ❌ |
In-memory session? | ✅ | ✅ | ✅ |
WebAssembly? | ✅ | ❌ | ❌ |
Cargo features
Note: For developers using `ort` in a library (if you are developing an app, you can skip this part), it is strongly recommended to use `default-features = false` to avoid bringing in unnecessary bloat. Cargo features are additive: users of a library that requires `ort` with default features enabled cannot remove those features, so if the library isn’t using them, they only add bloat and inflate compile times. Instead, enable `ort`’s default features in your dev dependencies only. Disabling default features also disables `download-binaries`, so you should instruct downstream users to include `ort = { version = "...", features = [ "download-binaries" ] }` in their dependencies if they need it.
- `download-binaries` (default): Enables downloading binaries via the `download` strategy. If disabled, the default behavior will be the `system` strategy.
- `copy-dylibs` (default): Copies the dynamic libraries to the Cargo build folder - see shared library hell.
- `half` (default): Enables support for using `float16`/`bfloat16` tensors in Rust.
- `fetch-models`: Enables fetching models from the ONNX Model Zoo. Useful for quick testing with some common models like YOLOv4, GPT-2, and ResNet. Not recommended in production.
- `load-dynamic`: Loads the ONNX Runtime binaries at runtime via `dlopen()` without a link dependency on them. The path to the binary can be controlled with the environment variable `ORT_DYLIB_PATH=/path/to/libonnxruntime.so`. This is heavily recommended, as it mitigates the shared library hell.
How to get binaries
You can use either the ‘traditional’ way, involving a strategy, or the new (and preferred) way, using `load-dynamic`.
- Strategies: Links to provided or downloaded dynamic libraries; see below. This is useful for static linking and quick prototyping (making use of the `download` strategy), but might cause more headaches than `load-dynamic`.
- `load-dynamic`: This doesn’t link to any dynamic libraries, instead loading the libraries at runtime using `dlopen()`. This lets you control the path to the ONNX Runtime binaries (meaning they don’t always have to be directly next to your executable) and avoids the shared library hell. To use this, enable the `load-dynamic` Cargo feature, and set the `ORT_DYLIB_PATH` environment variable to the path to your `onnxruntime.dll`/`libonnxruntime.so`/`libonnxruntime.dylib`. You can also use relative paths like `ORT_DYLIB_PATH=./libonnxruntime.so` (it will be relative to the executable). For convenience, you should download or compile ONNX Runtime binaries, put them in a permanent location, and set the environment variable permanently.
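The path handling described for `ORT_DYLIB_PATH` can be sketched with the standard library alone. This is a minimal illustration, not `ort`'s actual implementation: `dylib_file_name`, `resolve_dylib_path` and `dylib_path_from_env` are hypothetical helper names, and the only behavior taken from the text above is that a relative value is interpreted relative to the executable.

```rust
use std::path::{Path, PathBuf};

/// Library file name for a given target OS, using the three names listed above.
/// Mapping every non-Windows, non-macOS target to the `.so` name is an assumption.
pub fn dylib_file_name(target_os: &str) -> &'static str {
    match target_os {
        "windows" => "onnxruntime.dll",
        "macos" => "libonnxruntime.dylib",
        _ => "libonnxruntime.so",
    }
}

/// Hypothetical helper: absolute values are used as-is, relative values
/// are joined onto the directory that contains the executable.
pub fn resolve_dylib_path(value: &str, exe_dir: &Path) -> PathBuf {
    let path = Path::new(value);
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        exe_dir.join(path)
    }
}

/// Reads `ORT_DYLIB_PATH` and resolves it against the current executable.
/// Returns `None` if the variable is unset or the executable path is unknown.
pub fn dylib_path_from_env() -> Option<PathBuf> {
    let value = std::env::var("ORT_DYLIB_PATH").ok()?;
    let exe = std::env::current_exe().ok()?;
    Some(resolve_dylib_path(&value, exe.parent()?))
}
```

A launcher or installer could call `dylib_path_from_env()` before starting inference to check that the resolved file exists and report a clear error if it does not.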
Strategies
There are 2 ‘strategies’ for obtaining and linking ONNX Runtime binaries. The strategy can be set with the `ORT_STRATEGY` environment variable.
- `download` (default): Downloads prebuilt ONNX Runtime from Microsoft. Only a few execution providers are available for download at the moment, namely CUDA and TensorRT. These binaries may collect telemetry. In the future, pyke may provide binaries with telemetry disabled and more execution providers available.
- `system`: Links to ONNX Runtime binaries provided by the system or a path pointed to by the `ORT_LIB_LOCATION` environment variable. `ort` will automatically link to static or dynamic libraries depending on what is available in the `ORT_LIB_LOCATION` folder.
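The selection rule above (an explicit `ORT_STRATEGY` value, otherwise `download` when the `download-binaries` feature is on and `system` when it is off) can be modeled as a small pure function. This is a sketch under assumptions: `Strategy` and `select_strategy` are hypothetical names, and rejecting unknown values with an error is a choice made here, not documented behavior of `ort`'s build script.

```rust
/// The two strategies named in the text above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    Download,
    System,
}

/// Hypothetical model of strategy selection.
/// `env_value` stands for the value of `ORT_STRATEGY`, if set;
/// `download_binaries` stands for whether that Cargo feature is enabled.
pub fn select_strategy(
    env_value: Option<&str>,
    download_binaries: bool,
) -> Result<Strategy, String> {
    match env_value.map(str::trim) {
        Some("download") => Ok(Strategy::Download),
        Some("system") => Ok(Strategy::System),
        // Assumption: any other non-empty value is treated as a configuration error.
        Some(other) if !other.is_empty() => {
            Err(format!("unknown ORT_STRATEGY value: {other}"))
        }
        // Unset or empty: fall back to the feature-dependent default.
        _ => Ok(if download_binaries {
            Strategy::Download
        } else {
            Strategy::System
        }),
    }
}
```

In a real build script the first argument would typically come from `std::env::var("ORT_STRATEGY").ok().as_deref()`.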
Execution providers
To use other execution providers, you must explicitly enable them via their Cargo features, listed below. Some EPs are not currently implemented due to a lack of hardware for testing; please open an issue if your desired EP has a ⚠️.
- ✅ `cuda`: Enables the CUDA execution provider for Maxwell (7xx) NVIDIA GPUs and above. Requires CUDA v11.6+.
- ✅ `tensorrt`: Enables the TensorRT execution provider for GeForce 9xx series NVIDIA GPUs and above; requires CUDA v11.4+ and TensorRT v8.4+.
- ✅ `openvino`: Enables the OpenVINO execution provider for 6th+ generation Intel Core CPUs.
- ✅ `onednn`: Enables the Intel oneDNN execution provider for x86/x64 targets.
- ✅ `directml`: Enables the DirectML execution provider for Windows x86/x64 targets with dedicated GPUs supporting DirectX 12.
- ✅ `qnn`: Enables the Qualcomm AI Engine Direct SDK execution provider for Qualcomm chipsets.
- ❓ `nnapi`: Enables the Android Neural Networks API (NNAPI) execution provider. (needs testing - #45)
- ✅ `coreml`: Enables the CoreML execution provider for macOS/iOS targets.
- ⚠️ `xnnpack`: Enables the XNNPACK backend for WebAssembly and Android.
- ⚠️ `migraphx`: Enables the MIGraphX execution provider for AMD GPUs.
- ❓ `rocm`: Enables the ROCm execution provider for AMD ROCm-enabled GPUs. (#16)
- ✅ `acl`: Enables the ARM Compute Library execution provider for multi-core ARM v8 processors.
- ⚠️ `armnn`: Enables the ArmNN execution provider for ARM v8 targets.
- ✅ `tvm`: Enables the preview Apache TVM execution provider.
- ⚠️ `rknpu`: Enables the RKNPU execution provider for Rockchip NPUs.
- ⚠️ `vitis`: Enables Xilinx’s Vitis-AI execution provider for U200/U250 accelerators.
- ✅ `cann`: Enables the Huawei Compute Architecture for Neural Networks (CANN) execution provider.
Note that the `download` strategy only provides some execution providers, namely CUDA and TensorRT for Windows & Linux. You’ll need to compile ONNX Runtime from source and use the `system` strategy to point to the compiled binaries to enable other execution providers.
`ort` attempts to register execution providers in the order they are passed, silently falling back to the CPU provider if none of the requested providers are available. If you must know whether an EP is available, you can use `ExecutionProvider::cuda().is_available()`.
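The ordering and fallback behavior can be illustrated with a std-only model. This is a simplified sketch, not `ort`'s API: `Provider` and `register_in_order` are hypothetical, availability is reduced to a boolean, and appending the CPU provider as the final entry in every case is an assumption made to show the silent fallback.

```rust
/// Hypothetical stand-in for a requested execution provider.
pub struct Provider {
    pub name: &'static str,
    /// Whether the provider could be registered on this machine.
    pub available: bool,
}

/// Returns the providers that end up registered, in priority order.
/// Unavailable providers are skipped without an error, and "cpu" is
/// always appended last so that inference can still run.
pub fn register_in_order(requested: &[Provider]) -> Vec<&'static str> {
    let mut registered: Vec<&'static str> = requested
        .iter()
        .filter(|p| p.available)
        .map(|p| p.name)
        .collect();
    registered.push("cpu");
    registered
}
```

With `cuda` unavailable and `tensorrt` available, the result is `["tensorrt", "cpu"]`; with nothing available it is just `["cpu"]`, which is why a misconfigured GPU setup can go unnoticed unless availability is checked or logs are enabled.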
For prebuilt Microsoft binaries, you can enable the CUDA or TensorRT execution providers for Windows and Linux via the `cuda` and `tensorrt` Cargo features respectively. Microsoft does not provide prebuilt binaries for other execution providers, and thus enabling other EP features will fail when `ORT_STRATEGY=download`. To use other execution providers, you must build ONNX Runtime from source.
Projects using `ort` ❤️
open a PR to add your project here 🌟
- Twitter uses `ort` to serve homepage recommendations to hundreds of millions of users.
- Bloop uses `ort` to power their semantic code search feature.
- pyke Diffusers uses `ort` for efficient Stable Diffusion image generation on both CPUs & GPUs.
- edge-transformers uses `ort` for accelerated transformer model inference at the edge.
- Ortex uses `ort` for safe ONNX Runtime bindings in Elixir.
FAQ
I’m using a non-CPU execution provider, but it’s still using the CPU!
`ort` is designed to fail gracefully when an execution provider is not available. It logs failure events through `tracing`, thus you’ll need a library that subscribes to `tracing` events to see the logs. The simplest way to do this is to use `tracing-subscriber`.
Add `tracing-subscriber` to your Cargo.toml:
[dependencies]
tracing-subscriber = { version = "0.3", features = [ "env-filter", "fmt" ] }
In your main function:
fn main() {
    tracing_subscriber::fmt::init();
}
Set the environment variable `RUST_LOG` to `ort=debug` to see all debug messages from `ort`; this will look like:
- Windows (PowerShell):
$env:RUST_LOG = 'ort=debug'; cargo run
- Windows (Command Prompt): use PowerShell ;)
- macOS & Linux:
RUST_LOG="ort=debug" cargo run
My app exits with “status code `0xc000007b`” without logging anything!
You probably need to copy the ONNX Runtime DLLs to the same path as the executable.
- If you are running a binary (`cargo run`), copy them to e.g. `target/debug`
- If you are running an example (`cargo run --example xyz`), copy them to e.g. `target/debug/examples`
- If you are running tests (`cargo test`), copy them to e.g. `target/debug/deps`
Alternatively, you can use the `load-dynamic` feature to avoid this.
“thread ‘main’ panicked at ’assertion failed: `(left != right)`”
Most of the time this is because Windows ships its own (typically older) version of ONNX Runtime. Make sure you’ve copied the ONNX Runtime DLLs to the same folder as the exe.
Shared library hell
If using shared libraries (as is the default with `ORT_STRATEGY=download`), you may need to make some changes to avoid issues with library paths and load orders, or preferably use the `load-dynamic` feature to avoid all of this.
Windows
Some versions of Windows come bundled with an older version of `onnxruntime.dll` in the System32 folder, which will cause an assertion error at runtime:
The given version [14] is not supported, only version 1 to 13 is supported in this build.
thread 'main' panicked at 'assertion failed: `(left != right)`
left: `0x0`,
right: `0x0`', src\lib.rs:114:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
The fix is to copy the ONNX Runtime DLLs into the same directory as the binary, since DLLs in the same folder as the main executable are resolved before system DLLs. `ort` can automatically copy the DLLs to the Cargo target folder with the `copy-dylibs` feature, though this fix only works for binary Cargo targets (`cargo run`). When running tests/benchmarks/examples for the first time, you’ll have to manually copy the `target/debug/onnxruntime*.dll` files to `target/debug/deps/` for tests & benchmarks or `target/debug/examples/` for examples.
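The manual copy step can be automated with a few lines of standard-library code. The helper below is a hypothetical sketch, not something `ort` ships: `copy_onnxruntime_dlls` takes a profile directory such as `target/debug` and copies every `onnxruntime*.dll` it finds there into the `deps` and `examples` subfolders named above.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Copies `onnxruntime*.dll` from `profile_dir` (e.g. `target/debug`)
/// into `profile_dir/deps` and `profile_dir/examples`.
/// Returns the paths of the copies that were written.
pub fn copy_onnxruntime_dlls(profile_dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut copied = Vec::new();
    for entry in fs::read_dir(profile_dir)? {
        let path = entry?.path();
        // Skip entries whose names are not valid UTF-8.
        let Some(name) = path.file_name().and_then(|n| n.to_str()) else {
            continue;
        };
        let is_ort_dll =
            path.is_file() && name.starts_with("onnxruntime") && name.ends_with(".dll");
        if !is_ort_dll {
            continue;
        }
        for sub in ["deps", "examples"] {
            let dest_dir = profile_dir.join(sub);
            fs::create_dir_all(&dest_dir)?;
            let dest = dest_dir.join(name);
            fs::copy(&path, &dest)?;
            copied.push(dest);
        }
    }
    Ok(copied)
}
```

Calling `copy_onnxruntime_dlls(Path::new("target/debug"))` once after the first build covers tests, benchmarks and examples for the debug profile; a release build would need the same call with `target/release`.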
Linux
Running a binary via `cargo run` should work without `copy-dylibs`. If you’d like to use the produced binaries outside of Cargo, you’ll either have to copy `libonnxruntime.so` to a known lib location (e.g. `/usr/lib`) or enable rpath to load libraries from the same folder as the binary and place `libonnxruntime.so` alongside your binary.
In `Cargo.toml`:
[profile.dev]
rpath = true
[profile.release]
rpath = true
# do this for all profiles
In `.cargo/config.toml`:
[target.x86_64-unknown-linux-gnu]
rustflags = [ "-Clink-args=-Wl,-rpath,\\$ORIGIN" ]
# do this for all Linux targets as well
macOS
macOS has the same limitations as Linux. If enabling rpath, note that the rpath should point to `@loader_path` rather than `$ORIGIN`:
# .cargo/config.toml
[target.x86_64-apple-darwin]
rustflags = [ "-Clink-args=-Wl,-rpath,@loader_path" ]
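As an alternative to per-target entries in `.cargo/config.toml`, the same linker argument could be emitted from the build script of the binary crate. This is a different technique from the one documented above and is only a sketch: `rpath_link_arg` and `emit_rpath_instruction` are hypothetical names, while `CARGO_CFG_TARGET_OS` and the `cargo:rustc-link-arg` instruction are standard Cargo build-script facilities.

```rust
/// Linker argument that makes the binary search its own directory
/// for shared libraries, or `None` for targets not covered above.
pub fn rpath_link_arg(target_os: &str) -> Option<&'static str> {
    match target_os {
        "linux" => Some("-Wl,-rpath,$ORIGIN"),
        "macos" => Some("-Wl,-rpath,@loader_path"),
        _ => None,
    }
}

/// Intended to be called from `main` in a `build.rs`.
/// Cargo sets `CARGO_CFG_TARGET_OS` for build scripts; outside of a
/// build script the variable is unset and nothing is printed.
pub fn emit_rpath_instruction() {
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    if let Some(arg) = rpath_link_arg(&target_os) {
        println!("cargo:rustc-link-arg={arg}");
    }
}
```

The argument is passed to the linker without going through a shell, so `$ORIGIN` needs no escaping here.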
Re-exports
pub use self::environment::Environment;
pub use self::error::OrtApiError;
pub use self::error::OrtError;
pub use self::error::OrtResult;
pub use self::execution_providers::ExecutionProvider;
pub use self::io_binding::IoBinding;
pub use self::memory::AllocationDevice;
pub use self::memory::MemoryInfo;
pub use self::session::InMemorySession;
pub use self::session::Session;
pub use self::session::SessionBuilder;
pub use self::tensor::NdArrayExtensions;
pub use self::value::Value;
Modules
- Types and helpers for handling ORT errors.
- Contains the `Session` and `SessionBuilder` types for managing ONNX Runtime sessions and performing inference.
- Module containing tensor types.
Enums
- Execution provider allocator type.
- ONNX Runtime provides various graph optimizations to improve performance. Graph optimizations are essentially graph-level transformations, ranging from small graph simplifications and node eliminations to more complex node fusions and layout optimizations.
- The minimum logging level. Logs will be handled by the `tracing` crate.
- Memory types for allocated memory.
Functions
- Attempts to acquire the global `sys::OrtApi` object.